Dialog Act Modeling for Conversational Speech
Abstract
We describe an integrated approach for statistical modeling of discourse structure for natural conversational speech. Our model is based on 42 'dialog acts' (e.g., Statement, Question, Backchannel, Agreement, Disagreement, Apology), which were hand-labeled in 1155 conversations from the Switchboard corpus of spontaneous human-to-human telephone speech. We developed several models and algorithms to automatically detect dialog acts from transcribed or automatically recognized words and from prosodic properties of the speech signal, and by using a statistical discourse grammar. All of these components were probabilistic in nature and estimated from data, employing a variety of techniques (hidden Markov models, N-gram language models, maximum entropy estimation, decision tree classifiers, and neural networks). In preliminary studies, we achieved a dialog act labeling accuracy of 65% based on recognized words and prosody, and an accuracy of 72% based on word transcripts. Since humans achieve 84% on this task (with chance performance at 35%) we find these results encouraging.
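As a rough illustration of how a hidden Markov model can combine a statistical discourse grammar with per-utterance evidence, the sketch below runs Viterbi decoding over a sequence of utterances. It is not the authors' implementation: it assumes a bigram discourse grammar, the function name `viterbi_dialog_acts` and its argument layout are hypothetical, and the two-act probabilities in the usage example are invented purely for demonstration.

```python
import math


def viterbi_dialog_acts(obs_loglik, trans_logprob, init_logprob):
    """Return the most likely dialog act sequence under a bigram HMM.

    obs_loglik:    list with one dict per utterance, act -> log P(evidence | act)
    trans_logprob: dict (previous_act, act) -> log P(act | previous_act)
                   (an assumed bigram discourse grammar)
    init_logprob:  dict act -> log P(act) for the first utterance
    """
    if not obs_loglik:
        return []
    acts = list(init_logprob)
    # best[a] = log score of the best path that ends in act a at the current utterance
    best = {a: init_logprob[a] + obs_loglik[0][a] for a in acts}
    backpointers = []
    for loglik in obs_loglik[1:]:
        scores, pointers = {}, {}
        for cur in acts:
            prev = max(acts, key=lambda p: best[p] + trans_logprob[(p, cur)])
            scores[cur] = best[prev] + trans_logprob[(prev, cur)] + loglik[cur]
            pointers[cur] = prev
        best = scores
        backpointers.append(pointers)
    # Trace back from the best final act to recover the full sequence.
    path = [max(acts, key=lambda a: best[a])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy usage with two acts and invented probabilities (illustrative only).
log = math.log
init = {"Statement": log(0.6), "Question": log(0.4)}
trans = {
    ("Statement", "Statement"): log(0.7), ("Statement", "Question"): log(0.3),
    ("Question", "Statement"): log(0.8), ("Question", "Question"): log(0.2),
}
obs = [
    {"Statement": log(0.2), "Question": log(0.8)},
    {"Statement": log(0.9), "Question": log(0.1)},
]
print(viterbi_dialog_acts(obs, trans, init))  # ['Question', 'Statement']
```

In a sketch like this, the per-utterance log-likelihoods would come from whatever word-based or prosodic classifiers are available, while the transition table plays the role of the discourse grammar.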
Similar resources
Dialog Act Modeling for Automatic Tagging and Recognition of Conversational Speech
We describe a statistical approach for modeling dialog acts in conversational speech, i.e., speech-act-like units such as Statement, Question, Backchannel, Agreement, Disagreement, and Apology. Our model detects and predicts dialog acts based on lexical, collocational, and prosodic cues, as well as on the discourse coherence of the dialog act sequence. The dialog model is based on treating the d...
Johns Hopkins LVCSR Workshop-97 Switchboard Discourse Language Modeling Project Final Report
We describe a new approach for statistical modeling and detection of discourse structure for natural conversational speech. Our model is based on 42 ‘Dialog Acts’ (DAs), (question, answer, backchannel, agreement, disagreement, apology, etc). We labeled 1155 conversations from the Switchboard (SWBD) database (Godfrey et al. 1992) of human-to-human telephone conversations with these 42 types and ...
Automatic Detection of Discourse Structure for Speech Recognition and Understanding
We describe a new approach for statistical modeling and detection of discourse structure for natural conversational speech. Our model is based on 42 ‘Dialog Acts’ (DAs), (question, answer, backchannel, agreement, disagreement, apology, etc). We labeled 1155 conversations from the Switchboard (SWBD) database (Godfrey et al. 1992) of human-to-human telephone conversations with these 42 types and ...
A data-driven methodology for the production of multilingual conversational systems
This paper describes a data-driven methodology for the design of multilingual conversational systems. The work presented here covers the various aspects of bootstrapping and deploying multilingual systems, such as phone set definition, acoustic modeling, language modeling, and language understanding. For the initial system domain, a Directory Assistance Service has been chosen. Whereas former a...
Representation and Reasoning in a Multimodal Conversational Character
We describe the reasoning mechanisms used in a fully-implemented dialogue system. This dialogue system, based on a speech acts formalism, supports a multimodal conversational character for Interactive Television. The system maintains an explicit representation of programme descriptions, which also constitutes an attentional structure. From the contents of this representation, it is possible t...